A MODEL FOR FORECASTING SOVIET GRAIN YIELDS


INTRODUCTION 

1. For many years, this Office has assessed the progress of the Soviet grain 
crop during the growing season. Independent assessments are necessary mainly 
because the USSR Ministry of Agriculture provides only the most general situation 
reports during the growing season. Crop results are reported several months after 
the completion of the harvest. The tautness of the world food supply and the con- 
sequences of the wide swings in Soviet grain production have increased the need 
for timely forecasts of Soviet grain output. 

2. This publication documents the development of a model to predict grain 
yields in 33 crop regions (see the map). The predicted yields — based on time 
trends and a composite index of several weather variables — are combined with 
reported data on sown area to obtain crop estimates. With the help of the model, 
crop estimates can be made as early as April and revised to take account of 
additional information until the harvest is completed. 

3. The model's structure, computer programs, and weather data base can be
applied to any crop for which yield data are available. So far, the model has been
used to forecast yields for all grain, winter wheat, and spring wheat. 1 Estimating
the Soviet grain crop also requires estimates of sown area and adjustments to
reflect collateral information.


PRINCIPAL FINDINGS 

4. A weather-yield model has proved to be useful in estimating Soviet crop 
yields. The predicted total 1974 grain harvest was 2% above the actual harvest. 
The model can produce reasonably reliable estimates early in the season and can 
then be revised every ten days as new weather data are received. 2 

5. Data on production, sown area, and yields in 1958-73 have been collected 
for all grain, winter wheat, spring wheat, winter rye, and spring barley. A weather 
data base covering the period from 1961 to the present has also been assembled 
for use with the model. Relying on this data base, the model employs a weather 
index — representing the influence of temperature and precipitation — and a time 
trend — representing technological change — as explanatory variables in a set of 
independent linear equations that predict crop yields. 


Note: Comments and queries regarding this publication are welcomed. They may be
directed to [name deleted] of the Office of Economic Research, Code 143,
Extension [deleted].


1 In this publication, the official Soviet definition of all grain is used — including wheat, 
rye, barley, oats, corn, rice, millet, buckwheat, and pulses. 

2 This model was developed in early 1973 and formed the basis of CIA estimates for
that year. The model was modified slightly in 1974. This publication is limited to the current
model, and all data refer to it unless otherwise stated.



Approved For Release 2000/09/14 : CIA-RDP86T00608R000500230008-5 



6. ... The USSR
has announced a total grain harvest of 195.5 million metric tons and two republic
harvests: 111.6 million tons for the RSFSR and 46 million tons for the Ukraine.
The model's final prediction (end of August) for the USSR was 199.4 million tons,
within 2% of the total announced in December. 3 In September, Politburo Chairman
Brezhnev said the crop would be "not bad" but admitted that areas in Siberia
and Kazakhstan were having trouble. The announcement of a crop of 195.5 million
tons confirms these indications.

7. The prediction for 1974 also seems reasonable in terms of its development
over time and its forecast of regional trends. The prediction increased sharply
during May and June, when the weather was favorable, and then declined as
conditions deteriorated. The final predictions for the RSFSR and the Ukraine (115.3
million tons and 47.9 million tons, respectively) are slightly higher than the
published totals (by 3% and 4%). The prediction for the remaining portion of the USSR
is 4% below the actual harvest.

8. Although the model describes the historical data quite well, prediction
errors can be significant. Tests show that an error of 2% to 8% in the predicted
national yield can be expected. An error in estimates of the sown area can either
compound or mitigate the effect of errors in predicted yields.


FORMULATION OF THE MODEL 

Factors Influencing Yields 

Weather 

9. Although the importance of weather in determining grain yields is ap- 
parent, the causal relationships are difficult to define. Many individual weather 
variables affect the final yield. Moisture, for example, influences the number of 
plants per hectare, the number of stalks per plant, leaf area, the number of heads 
per plant, the number of kernels per head, and the weight per kernel. Temperature
variations speed up or curb growth and change the water balance in the plant. 
Extreme temperatures may injure or kill the plant. 

10. Other weather variables that influence yields include sunlight, humidity,
hail, and wind. In particular, the hot sukhovey (dry wind) prevalent in the
eastern grain-growing regions of the USSR greatly increases evaporation and
transpiration, and at times even beats down a crop. Weather variables also affect yields
indirectly because of the importance of good weather in fieldwork at planting and
harvest times and because of their role in encouraging or retarding plant diseases,
parasites, and weeds.

11. Specific patterns of interaction between weather variables and the physio- 
logical requirements of plants (and hence yields) vary considerably according 
to the peculiarities of local climates. Nonetheless, certain common factors are 
important. In the early stages of growth, at least through tillering, grain requires
a continuous supply of moisture in the upper layer of the soil. After tillering, which
normally occurs a little less than 30 days after sowing, the period of rapid
vegetative growth occurs. As stems and leaves develop, consumption of water by the
plants increases greatly. The heading stage, which is critical to grain yields, usually
occurs a little more than 30 days after tillering. 4 The relatively high temperatures
that generally prevail in the USSR during this stage result in increased
transpiration and in the more rapid depletion of soil moisture by evaporation. At that stage,
therefore, the plants require more rainfall than during earlier stages of
development.

3 The predictions represent gross grain production: the grain obtained from the harvesting
machine in the field, including excess moisture, unripe and damaged kernels, weed seeds,
and the losses in handling and transporting the grain between the field and storage facilities.

12. After heading, the plant's dependence on moisture and its sensitivity to
temperature decrease. Excessive rainfall still may damage the crop, however, by 
causing lodging (matting) and by promoting diseases — such as rust, scab, mildew, 
and leaf spot — as well as weed growth. As in the earlier stages of growth, excessive 
temperatures may be injurious, especially if accompanied by dry winds. Too low 
a temperature preceding the harvest may delay ripening so that the crop is caught 
by early frost. 

13. Links between weather factors and the stages of growth cannot be defined
with precision because of annual variations in sowing dates and in the seasonal 
pattern of growth among different grains and areas. For example, the same grain 
in the same area may be planted at substantially different times in different years, 
and separate varieties planted in the same region may grow at different rates. 


Technology 

14. Crop yields have increased steadily since 1958 in many regions of the
USSR but very slowly in other regions (see Table 1). 5 The systematic improvement
that has occurred probably reflects increased use of agricultural chemicals
(fertilizer, lime, and insecticides) and better and more timely cultivation and
harvesting. All of these changes are technological improvements. Unfortunately,

4 These stages of development refer to spring-sown small grains. Fall-sown winter grains
usually tiller before entering dormancy.

5 Data for earlier years could not be used, because in 1958 the official measure of the
grain crop is believed to have changed over to a new basis. For a discussion of this question,
see Eberhard Schinke, "Soviet Agricultural Statistics" in Vladimir G. Treml and John P. Hardt
(eds.), Soviet Economic Statistics, p. 242.


Table 1

USSR: Grain Yields
(Centners per Hectare)

                     All Grain
Year    USSR    RSFSR   Ukraine   Kazakhstan
1958    11.1    10.0    10.8       0.5
1959    10.4     0.0    10.1       8.7
1960    10.0    10.7    15.8       8.5
1961    10.7     0.0    10.0       0.0
1962    10.0    11.0    17.0       0.5
1963     8.3     8.3    12.0       4.4
1964    11.1    10.7    17.7       0.8
1965     0.5     0.0    10.2       3.1
1966    13.7    13.1    21.5      10.8
1967    12.1    11.0    20.5       0.3
1968    14.0    14.7    18.5       8.4
1969    13.2    12.2    23.0       8.8
1970    15.0    15.0    23.4       0.8
1971    15.4    14.0    25.4       0.4
1972    14.0    12.5    21.2      12.5
1973    17.0    10.8    20.0      11.2

              Winter Wheat                   Spring Wheat
Year    USSR    RSFSR   Ukraine     USSR    RSFSR   Kazakhstan
1958    10.2    17.5    17.0         0.7     0.0     0.0
1959    15.2    13.5    18.0         0.4     0.8     8.8
1960    15.1    10.1    17.5         0.5    10.2     8.4
1961    10.0    15.0    21.0         8.2     0.0     0.0
1962    10.8    18.0    17.0         8.2     0.2     0.5
1963    12.0    13.4    14.2         5.0     7.0     3.8
1964    13.5    12.0    17.0         0.0     0.0     0.0
1965    10.1    14.1    21.3         5.5     0.3     3.1
1966    20.4    10.0    24.8        12.0    12.0    11.0
1967    17.8    15.7    23.1         8.0    10.5     0.1
1968    18.3    10.0    20.5        12.2    14.4     8.3
1969    18.0    15.5    23.5        50.1    11.0     8.0
1970    22.8    23.0    20.0        12.3    13.8     0.0
1971    23.1    20.8    20.0        11.8    13.1     0.4
1972    10.0    10.7    25.5        13.0    13.1    12.9
1973    27.0    25.2    31.0        13.5    14.8    11.2
information on such improvements is incomplete and, when available, represents
national or republic totals, not the individual crop regions required for the crop
prediction model.


15. As a surrogate for technological improvement, a time trend was tested
in all crop regions and used when a high correlation between yields and time was
found. In the districts where time was not correlated with crop yields (mainly
the spring wheat belt of Western Siberia and Kazakhstan), new higher yielding
varieties have not been introduced, and normal precipitation is inadequate for
efficient use of most types of fertilizer.


Availability of Data 

16. The weather data used in the prediction model include observations on
precipitation and temperature for each of 27 crop regions from 1961 to 1973. 6 The
observations are monthly averages except during the growing season, when
average precipitation readings are available for ten-day periods. Weather data are not
available for crop regions 28-33.

17. The yield data available cover all grain, spring wheat, winter wheat, 
winter rye, and spring barley for the years 1958-73. Yields were calculated from 
published data for the oblasts within a crop region, except for those few crop 
regions that coincide with the areas for which the USSR reports yields. 

18. The weather data have their shortcomings. In the early years of collection
(roughly 1961-65) data were transcribed by hand from weather maps, leading
to errors in transcription. Only the most obvious of these errors could be
corrected. More important, the observation for a given crop region is simply an 
unweighted average of the observations from all of the weather reporting stations 
in that region because of the lack of data on sown area below the oblast level. 
Thus, the weather observations in the major crop areas within a crop region were 
not accorded a proportionately greater weight. 
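
The difference between the unweighted regional average described above and an area-weighted one can be illustrated with a minimal Python sketch. The station readings and sown-area weights are invented for illustration; the source reports no such figures, and the function names are hypothetical.

```python
# Hypothetical June precipitation (mm) at four stations in one crop region.
station_precip = [30.0, 42.0, 55.0, 61.0]
# Hypothetical sown area (thousand hectares) around each station. Data of this
# kind were not available below the oblast level, hence the unweighted average.
sown_area = [900.0, 600.0, 100.0, 50.0]


def unweighted_mean(values):
    """Simple average: every station counts equally."""
    return sum(values) / len(values)


def weighted_mean(values, weights):
    """Average in which each station counts in proportion to its weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


simple = unweighted_mean(station_precip)
weighted = weighted_mean(station_precip, sown_area)
print(f"unweighted: {simple:.1f} mm, area-weighted: {weighted:.1f} mm")
```

In this invented case the drier stations sit in the main grain areas, so the unweighted average overstates the rainfall that the crop actually received.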

19. Errors in the yield data can arise from inconsistencies in official Soviet 
statistics, the need to estimate unpublished data, and the need to estimate the 
ratio of sown area to aggregate yields. Although many inconsistencies appear in
published Soviet sources, most involve a variance of only 0.1 centner per hectare. 
Wheat yields for the RSFSR are usually reported only for spring and winter wheat 
combined. By using additional information, yields for spring and winter wheat 
could be estimated separately. The estimates that could be checked were quite 
accurate; hence, any distortion is believed to be small. For some crops, such as 
spring barley, yield data were available for each division within a crop region but 
sown areas were unknown. The yield data therefore had to be aggregated by 
using estimated weights. 7 

The Prediction Model 

Specification 

20. The prediction model assumes that grain yield is a linear function of 
technology and weather. A linear time trend represents the influence of improved 
technology (for example, fallowing and increased use of fertilizers) and improved
... comprises a weather index that represents the influence of weather on yields.

6 Although weather data have been collected since 1958, complete series for all variables
used in the model are available only since 1961. The appendix describes the sources of weather
and yield data. The 27 crop regions are identified on the map.

7 These errors are discussed in more detail in the appendix.


21. The Soviet grain yield prediction model is a set of 33 independent linear
equations:

(1) $Y^*_{ij} = c^*_{0j} + c^*_{1j} T_{ij} + c^*_{2j} V^*_{ij}$

where $Y^*_{ij}$ = predicted yield,

$T_{ij}$ = time,

$V^*_{ij}$ = the weather index,

$c^*_{0j}$, $c^*_{1j}$, and $c^*_{2j}$ are estimated model parameters,
i = year (1958 to 1974), and
j = crop region (1 to 33).

Estimates of sown area for the 33 crop regions are multiplied by the yield 
estimates for the crop regions to obtain forecasts of total production and the 
average yield. The estimates of sown area are based on Soviet press reports and 
past trends in each crop region. 
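
The aggregation step can be sketched in a few lines of Python. This is a minimal illustration, assuming yields in centners per hectare (1 centner = 0.1 metric ton) and sown areas in million hectares; the three-region figures and the function name `aggregate` are hypothetical.

```python
def aggregate(yields_c_per_ha, areas_mln_ha):
    """Return (production in million tons, average yield in centners/ha)."""
    # centners/ha * million ha = million centners; 0.1 converts to million tons.
    production = sum(0.1 * y * a for y, a in zip(yields_c_per_ha, areas_mln_ha))
    # The average yield is the area-weighted mean of the regional yields.
    avg_yield = production / (0.1 * sum(areas_mln_ha))
    return production, avg_yield


# Hypothetical three-region example (not data from the source).
yields = [20.0, 15.0, 10.0]   # predicted yields, centners per hectare
areas = [10.0, 20.0, 30.0]    # estimated sown areas, million hectares
production, avg_yield = aggregate(yields, areas)
print(f"production: {production:.1f} million tons, yield: {avg_yield:.2f} c/ha")
```

Because the average yield is weighted by sown area, an error in a large region's area estimate shifts both the production total and the national yield.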


22. The eleven weather variables used to form the weather index are: 

• accumulated precipitation in millimeters from October through March; 

• monthly precipitation in millimeters for April, May, June, July, and 
August; and 

• average monthly temperatures in degrees centigrade for April, May, 
June, July, and August. 


Because weather data are not available for crop regions 28 through 33, $c^*_{2j}$ in
equation (1) assumes the value of zero in these six crop regions. The weather
variables listed above were selected from a set of 54 standard weather variables 
now available for weather-yield regression work on the USSR (see the appendix). 
In addition, new variables can be formed by combining standard variables. 
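
How equation (1) behaves when a term is unavailable can be shown with a minimal Python sketch. The coefficient values are hypothetical, and the flags `has_trend` and `has_weather` are illustrative names, not part of the original computer programs.

```python
def predict_yield(c0, c1, c2, t, v, has_trend=True, has_weather=True):
    """Equation (1): constant + trend term + weather term.

    A coefficient is treated as zero when the corresponding term is not
    used for a crop region (no weather data, or no significant trend).
    """
    trend_term = c1 * t if has_trend else 0.0
    weather_term = c2 * v if has_weather else 0.0
    return c0 + trend_term + weather_term


# Hypothetical coefficients for one crop region (centners per hectare).
c0, c1, c2 = 8.0, 0.5, 1.0
t = 17        # 1974 counted from 1958 = 1
v = 1.2       # hypothetical weather index for the year

full = predict_yield(c0, c1, c2, t, v)
no_weather = predict_yield(c0, c1, c2, t, v, has_weather=False)
print(f"with weather index: {full:.1f}, trend only: {no_weather:.1f}")
```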


Estimation 

23. We estimated the Soviet grain yield prediction model in four steps. 
First, we estimated the effect of a time trend on yield in each of the 33 crop 
regions. We then estimated the influence of the weather variables, constructed 
the weather index, and finally estimated the parameters $c^*_{0j}$, $c^*_{1j}$, and $c^*_{2j}$ in
equation (1).

24. In step one, we estimated the parameters $a_{0j}$ and $a_{1j}$ shown in equation
(2). We derived these estimates by performing linear regressions for the 33
time-yield equations:

(2) $Y_{ij} = a_{0j} + a_{1j} T_{ij} + e_{ij}$

where $Y_{ij}$ = reported yield,

i = year (1958 to 1973),
j = crop region (1 to 33), and
$e_{ij}$ = an unexplained residual.

The calculated yields and residuals from equation (2) are

(3) $Y^{**}_{ij} = a^*_{0j} + a^*_{1j} T_{ij}$

(4) $e^*_{ij} = Y_{ij} - Y^{**}_{ij}$

The estimated time trend coefficient ($a^*_{1j}$) was not significantly different from
zero for many eastern grain-growing regions. Therefore, the value of $c^*_{1j}$ in
equation (1) is set at zero for these regions. 8
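
Step one can be sketched for a single crop region in plain Python. This is an illustration only: the yield series is invented, the function name `fit_trend` is hypothetical, and the significance test applied to the trend coefficient is not reproduced.

```python
def fit_trend(times, yields):
    """Ordinary least squares fit of yield = a0 + a1 * time, as in equation (2)."""
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(yields) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, yields))
    a1 = sxy / sxx
    a0 = y_mean - a1 * t_mean
    return a0, a1


# Hypothetical yields for one crop region, 1958-73: a rising trend of
# 0.4 centner per year with alternating good and bad years.
years = list(range(1958, 1974))
yields = [10.0 + 0.4 * k + (1.0 if k % 2 == 0 else -1.0) for k in range(16)]

a0, a1 = fit_trend(years, yields)
fitted = [a0 + a1 * t for t in years]                    # equation (3)
residuals = [y - f for y, f in zip(yields, fitted)]      # equation (4)
print(f"trend: {a1:.3f} centners per hectare per year")
```

The residuals are what the second step tries to explain with weather; by construction they sum to zero within each region.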

25. In the second step, we regressed the residuals calculated in equation
(4) above on 11 weather variables to obtain an index of the influence of
weather. The available weather data for each crop region are limited to 13
observations (1961-73) for crop regions 1-27. Therefore, in order to have enough
observations to estimate the weather index, we arranged these 27 crop regions
into 13 groups. We then computed the following regression equation:

(5) $e^*_{ijm} = b_{0m} + \sum_{k=1}^{11} b_{km} W_{kijm} + u_{ijm}$

where $e^*_{ijm}$ = the estimated residual from equation (4) for year i in
crop region j, which is assigned to one of the 13
regional groups (m),

$W_{kijm}$ = weather variable k (k = 1, ..., 11), and
$u_{ijm}$ = an unexplained residual.

26. The aggregation of the data into 13 groups of crop regions is called
pooling the data. 9 The use of a single regression equation to express the
relationship of yield to weather variables for a multi-region area implicitly assumes that
the weather variables have the same influence on yield in each of the regions
grouped. Thus, it was necessary to select groupings of crop regions with similar
climates, soils, and cultivation practices.

27. In the third step, we calculated the weather index $V^*_{ij}$ for each crop
region. Given the parameters $b^*_{km}$ (k = 0, ..., 11) estimated from equation (5),
the weather index was computed as:

(6) $V^*_{ij} = b^*_{0m} + \sum_{k=1}^{11} b^*_{km} W_{kij}$

where $V^*_{ij}$ = weather index for crop region j in year i,

$W_{kij}$ = weather variable k for crop region j in year i,

and all 27 crop regions (j) are assigned to one of 13
regional groups (m).
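
Steps two and three can be sketched together in plain Python. This is a minimal illustration under stated assumptions: two weather variables stand in for the eleven, one group contains two crop regions, all figures are invented, and the helper names `solve` and `ols` are hypothetical. The residuals are generated from known coefficients so that the fit can be checked.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * p for x, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(rows, targets):
    """Least-squares coefficients via the normal equations."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical group of two crop regions, four years each:
# (spring precipitation in mm, June temperature in degrees centigrade).
weather = {
    1: [(40.0, 18.0), (55.0, 20.0), (30.0, 23.0), (60.0, 17.0)],
    2: [(35.0, 19.0), (50.0, 21.0), (45.0, 16.0), (25.0, 24.0)],
}
# Trend residuals built from known coefficients: e = -1.0 + 0.05*P - 0.06*T.
true_b = (-1.0, 0.05, -0.06)
resid = {j: [true_b[0] + true_b[1] * p + true_b[2] * t for p, t in obs]
         for j, obs in weather.items()}

# Equation (5): pool every region-year of the group into one regression.
rows = [[1.0, p, t] for j in weather for p, t in weather[j]]
targets = [e for j in weather for e in resid[j]]
b = ols(rows, targets)

# Equation (6): the weather index is the fitted value for each region-year.
index = {j: [b[0] + b[1] * p + b[2] * t for p, t in weather[j]] for j in weather}
print("estimated coefficients:", [round(x, 4) for x in b])
```

Pooling gives the regression eight observations instead of four, which is the point of grouping regions; the price is the assumption that the same coefficients hold in every region of the group.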

28. In the fourth step, we estimated the coefficients $c_{0j}$, $c_{1j}$, and $c_{2j}$
(j = 1, ..., 33) in the model's grain yield prediction equation (1) by performing
multivariate linear regression using equation (7) and the values for $V^*_{ij}$ estimated
in equation (6).

(7) $Y_{ij} = c_{0j} + c_{1j} T_{ij} + c_{2j} V^*_{ij} + u_{ij}$

The Pattern of the Regression Coefficients 

29. The signs of the estimated coefficients of the weather regression — 
equation (5) — are shown in Table 2. The coefficients for October-March pre- 
cipitation are generally positive (16 of 27 crop regions). Preseason precipitation 
would be expected to have a positive effect on yield, especially in the winter 
grain areas. Nonetheless, many of the negative coefficients are in winter grain 

8 Crop regions 6, 10, and 20-27.

9 Pooling data is the use of cross-section data with time series data. Pooling the data
increases the number of observations used in estimating the parameters in each equation and
hence increases the number of degrees of freedom.
Table 2

Coefficient Signs of Weather Variables 1

Columns: Regions; Precipitation (Oct-Mar, Apr, May, Jun, Jul, Aug); Temperature (Apr, May, Jun, Jul, Aug).

[The sign entries for each regional group are illegible in the source. Only the "N.A." entries for the two separately estimated regions can be made out.]

1 These coefficients resulted from using the 11 weather variables enumerated with the grouping
of crop regions given. Regions 3 and 17 were estimated separately with a reduced set of weather
variables.

areas. Precipitation seems to be important in April, May, and June, with the
coefficients generally positive. The coefficients for April temperature are mostly
positive, but several turn negative in May and June, suggesting that very hot
weather at the onset of heading is not desirable. The need for dry weather at
harvest time clearly shows up in the winter grain areas in July, as most
precipitation coefficients shift from positive to negative. The spring grain areas retain
their positive coefficients in July, but two of them become negative in August.

30. The results presented in Table 2 are generally encouraging, although
the signs do not correspond with agronomic theory in several instances. Dis- 
crepancies could arise for a number of reasons: 

• Although weather and time were assumed to be unrelated, in fact 
weather may be cyclical. Since 1961, precipitation in the USSR has 
tended to increase and average temperature has declined slightly, 
affecting the estimated time trend in equation (2) and biasing the 
weather variable coefficients. 

• Weather variables also may be related within the same year, contrary
to the model's assumption that the weather variables in equation (5)
are independent. Collinearity of this nature could easily cause a sign
reversal in the weather coefficients.

• Several of the weather variable coefficients were statistically
insignificant. 10 There are three likely causes: (1) a variable truly is not
related to yield, (2) a variable is collinear with another weather variable
and therefore does not receive its full credit, 11 and (3) the small number
of observations precludes an accurate test of significance.

10 A coefficient is statistically insignificant if it cannot be stated with a high probability
(usually 95%) that the true value of the coefficient is non-zero.

11 A number of tests were made by reestimating equation (5) after excluding one or
more insignificant weather variables. These tests did not show any important sign reversals,
but did sharply improve the significance of the remaining variables. This indicates the presence
of collinearity among the weather variables.
• The types of grain grown within each crop region and in the nation
change over time because of weather, shifts in demand, and technological
change. The coefficients of the weather equation reflect changes
in the structure of the sown area and therefore are unlikely to apply
fully in any particular year.


PREDICTIONS FOR 1974 

31. The use of the model to forecast grain yields began in April and
continued throughout the summer months. The forecasts were updated every ten
days, as new weather data were received. Two basic types of forecasts were
made during the 1974 crop year. The first used only weather variables for
which 1974 data had been received. For example, the forecast made as of the
end of April did not include the influence of weather in May, June, July, and
August. The second type used all weather variables, substituting long-run norm
values for variables not yet reported. The results reported here are all from the
second type of forecast.
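
The second type of forecast can be sketched as a simple substitution rule in Python. The norm and observed values below are invented, and the function name `fill_with_norms` is hypothetical.

```python
def fill_with_norms(observed, norms):
    """Use reported values where available and long-run norms otherwise."""
    return {name: observed.get(name, norms[name]) for name in norms}


# Hypothetical long-run norms for monthly precipitation (mm) in one crop region.
norms = {"Apr": 30.0, "May": 45.0, "Jun": 60.0, "Jul": 70.0, "Aug": 55.0}
# Forecast as of the end of May: only April and May have been reported.
observed = {"Apr": 22.0, "May": 58.0}

inputs = fill_with_norms(observed, norms)
print(inputs)
```

Each ten-day update replaces one more norm with an observation, so the forecast moves only as far as the new weather departs from its norm. The first type of forecast would instead drop the unreported variables altogether.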

32. In 1974 the estimate of grain production ranged from 184 to 203 million
tons. A preseason estimate of 185.9 million tons was based on long-run norm
values for all weather variables. The tabulation below shows how the
predictions changed as additional weather data were received.

                     Predicted Yield      Predicted 1974
                     (Centners per        Grain Production
Weather Through      Hectare)             (Million Tons)

Preseason            14.5                 185.9
March                14.6                 186.3
April                14.4                 184.2
May                  14.8                 189.9
June                 15.9                 203.3
July                 15.5                 197.8
August               15.6                 199.4


The development of the harvest prediction becomes clearer when republic
estimates are examined. The gains made in Kazakhstan through April were more
than canceled out by poor weather in the rest of the country. In May, Kazakhstan
fared poorly, but developments in the Ukraine easily compensated for
Kazakhstan's diminished prospects and provided the first hope for a good year. In June,
predicted yields in the Ukraine continued to increase, and those in the RSFSR
made a very large jump as a result of abundant rains. Predicted harvests in all
areas fell off in July and recovered slightly in August. The final line in the
tabulation of predicted 1974 grain production below shows the production as reported
in December 1974. The predicted values for the RSFSR and the Ukraine are
3.7 million and 1.9 million tons too high. Since their sum exceeds the overestimate of
3.9 million tons for the USSR as a whole, the prediction for Kazakhstan and other
areas must be too low.


                                                         Million Tons

                                                     Other
Weather Through        RSFSR   Ukraine  Kazakhstan   Areas    Total

Preseason              109.4    39.6     19.8        17.1     185.9
March                  109.2    40.3     19.8        17.0     186.3
April                  107.8    39.1     21.8        15.5     184.2
May                    108.8    46.0     19.1        16.0     189.9
June                   116.3    48.6     21.0        17.4     203.3
July                   114.0    47.1     20.5        16.2     197.8
August                 115.3    47.9     19.5        16.7     199.4
Reported production    111.6    46.0     N.A.        N.A.     195.5
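
The arithmetic behind the comparison of predicted and reported production can be checked with a short Python sketch. The figures are the August predictions and reported totals from the tabulation above; the function name `pct_error` is a hypothetical helper.

```python
# August predictions and December 1974 reported totals (million tons).
predicted = {"RSFSR": 115.3, "Ukraine": 47.9, "Kazakhstan": 19.5, "Other": 16.7}
reported = {"RSFSR": 111.6, "Ukraine": 46.0, "USSR": 195.5}


def pct_error(pred, actual):
    """Signed prediction error as a percentage of the reported figure."""
    return 100.0 * (pred - actual) / actual


predicted_total = sum(predicted.values())

# Kazakhstan and other areas are not reported separately, so their combined
# harvest is taken as the USSR total less the two reported republics.
rest_predicted = predicted["Kazakhstan"] + predicted["Other"]
rest_reported = reported["USSR"] - reported["RSFSR"] - reported["Ukraine"]

print(f"USSR:    {pct_error(predicted_total, reported['USSR']):+.1f}%")
print(f"RSFSR:   {pct_error(predicted['RSFSR'], reported['RSFSR']):+.1f}%")
print(f"Ukraine: {pct_error(predicted['Ukraine'], reported['Ukraine']):+.1f}%")
print(f"Rest:    {pct_error(rest_predicted, rest_reported):+.1f}%")
```

The overestimates for the RSFSR and the Ukraine (3.7 and 1.9 million tons) are partly offset by an underestimate of about 1.7 million tons for the rest of the country, which is why the national error is smaller than either republic error.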
33. Tabulations of the progress of the predictions for winter and spring
wheat indicate that winter wheat caused the increase in the predicted grain crop
in May, and spring wheat caused the large jump in June.

WINTER WHEAT
                                                  Million Tons

Weather Through     RSFSR    Ukraine    Other Areas    Total

Preseason           10.2     20.1       5.0            41.9
March               15.7     20.1       5.0            41[?]
April               14.3     20.3       5.4            40.0
May                 15.8     22.9       5.7            44.4
June                10.3     23.4       5.9            45.0
July                14.4     22.9       5.7            43.0

SPRING WHEAT
                                                  Million Tons

Weather Through     RSFSR    Kazakhstan    Other Areas    Total

Preseason           32.1     12.3          0.1            44.5
March               32.2     12.4          0.1            44.7
April               32.0     13.1          0.1            45.8
May                 32.4     11.0          0.1            44.1
June                30.2     12.0          0.1            48.9
July                35.9     11.0          0.1            47.0
August              30.0     10.0          0.1            47.3


EVALUATION OF THE MODEL 

34. The value of the model was tested by computing its prediction errors 
using historical data and by comparing its performance with other models. First, 
we deleted data and reestimated the model coefficients from the remaining 
data. 12 We predicted crop yields from equation ( 1 ) for each of the 33 crop regions. 
Next we computed an average USSR yield using reported sown areas. We then 
computed the difference between our predicted USSR yield and the yield re- 
ported by the USSR Central Statistical Administration. The following tabulation 
shows the direction and percentage magnitude of the prediction errors, i.e.,

$\frac{\text{predicted yield} - \text{reported yield}}{\text{reported yield}}$



estimated sown area of 128 million hectares. 

12 This procedure resembles the way the model would be used in practice. As data for
additional years become available, the model would be reestimated.
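
One reading of this test procedure is an expanding-window evaluation: for each test year, the model is fitted on earlier years only and then used to predict that year. The Python sketch below illustrates the idea with a trend-only stand-in for the full model. The yield series, the choice of 1969 as the first test year, and the function names are all assumptions made for illustration.

```python
def fit_trend(times, yields):
    """Ordinary least squares fit of yield = a0 + a1 * time."""
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(yields) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, yields))
    a1 = sxy / sxx
    return y_mean - a1 * t_mean, a1


def rolling_errors(years, yields, first_test_year):
    """Percentage prediction error for each test year, using earlier data only."""
    errors = {}
    for k, year in enumerate(years):
        if year < first_test_year:
            continue
        a0, a1 = fit_trend(years[:k], yields[:k])   # data before the test year
        predicted = a0 + a1 * year
        errors[year] = 100.0 * (predicted - yields[k]) / yields[k]
    return errors


# Hypothetical national yields (centners per hectare), 1961-73.
years = list(range(1961, 1974))
yields = [10.5, 11.0, 8.5, 11.5, 9.5, 13.5, 12.0, 14.0, 13.0, 15.5, 15.5, 14.0, 17.5]

errors = rolling_errors(years, yields, first_test_year=1969)
for year, err in errors.items():
    print(f"{year}: {err:+.1f}%")
```

Evaluating the model this way avoids judging it on years that helped determine its coefficients.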
This test shows that an error of between 2% and 8% should be expected:
the range of the percentage errors shown on the diagonal for the years 196[?]-74.
While the upper part of this range is too high from the standpoint of making
useful forecasts, 13 experience has shown that collateral information can be used to
detect anomalies and reduce the error.


35. Next we constructed several naive models to determine whether our
model provides better predictions. Even the best of these naive models did
not perform well enough to merit consideration. In the best of the naive models,
yield was a linear function of time, past yields, and selected weather variables.
The data for all crop regions for which weather data were available (regions
1-27) were used in the same equation. This equation was able to describe
historical yields fairly well (the coefficient of determination was 0.8), but it
did not capture the cyclical behavior of Soviet agriculture, and the average
absolute percentage prediction error was almost double that of our disaggregated
model, based on predictions for 196[?]-72.

36. In evaluating the model described in this publication (and especially the
predictions for 1974), the fact that production forecasts depend on the accuracy
of sown area must be kept in mind. Although actual sown areas have not been
reported for 1974, the errors in estimates of total sown areas in 1973 ranged
from 2.1% for all grain to 13.1% for winter wheat, as shown below.

                            Million Hectares

                                  Estimates Used in     Percentage
Crop               Actual         Prediction Model      Error

All grain          126.7          124.0                  -2.1
Spring wheat        44.8           47.0                   4.9
Winter wheat        18.3           15.9                 -13.1


Sown area data by crop region in 1973 have been received only for the Ukraine. 
A comparison of actual sown areas in the Ukraine in 1973 with those estimated on 
the basis of laborious review of published information also shows substantial 
errors, as follows: 


                                   All Grain                           Winter Wheat

                           Million Hectares     Percentage     Million Hectares     Percentage
Crop Region                Actual   Estimated   Error 1        Actual   Estimated   Error 1

Total Ukraine 2            10.6     15.2         -8.8          8.3      7.0         -10.4
Western Ukraine             2.5      2.3         -6.9          1.2      1.2          -3.3
North Central Ukraine       4.5      4.2         -6.6          2.1      2.0          -3.9
Northeast Ukraine           2.0      2.4         -7.4          1.0      1.2          [?]1.9
Eastern Ukraine             3.0      3.3         -9.2          1.9      1.0         -47.9
Southern Ukraine            3.5      3.0        -13.5          2.2      1.7         -22.8

1 Percentages were derived from unrounded data.

2 Because of rounding, components may not add to the totals shown.

Knowledge of the actual distribution of the 1973 Ukrainian sown area would not 
have greatly improved the estimate of the overall Ukrainian yield (from 28.76 
to 28.82 centners per hectare for winter wheat), but the error in the predicted 
total harvest would have been sharply reduced (from -24.8% to -9.7% for 
winter wheat). Thus, the effect on crop estimates of errors in the estimates of the
regional distribution of sown area appears to be much less than the effect of
errors in estimates of the total sown area.


13 An error of 8% applied to a national yield of 15 centners per hectare and a sown area
of 128 million hectares results in a harvest error of 15 million tons.
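
The footnote's arithmetic, and the way a sown-area error can compound or mitigate a yield error, can be checked with a short Python sketch. The yield, area, and 8% figures come from the footnote, and the -2.1% area error is the 1973 all-grain figure; the function names are hypothetical, and the multiplicative treatment of the two errors is an assumption that follows from production being yield times area.

```python
def harvest_error_mln_tons(yield_c_per_ha, area_mln_ha, yield_error_pct):
    """Harvest error implied by a percentage error in the national yield."""
    # centners/ha * million ha = million centners; 10 centners = 1 metric ton.
    return yield_c_per_ha * area_mln_ha * (yield_error_pct / 100.0) / 10.0


def combined_error_pct(yield_error_pct, area_error_pct):
    """Production error when yield and sown area are both misestimated."""
    factor = (1.0 + yield_error_pct / 100.0) * (1.0 + area_error_pct / 100.0)
    return 100.0 * (factor - 1.0)


low = harvest_error_mln_tons(15.0, 128.0, 2.0)
high = harvest_error_mln_tons(15.0, 128.0, 8.0)
print(f"harvest error: {low:.1f} to {high:.1f} million tons")

# A -2.1% area error offsets part of an 8% yield overestimate but adds to
# an 8% yield underestimate.
print(f"offsetting:  {combined_error_pct(8.0, -2.1):+.2f}%")
print(f"compounding: {combined_error_pct(-8.0, -2.1):+.2f}%")
```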
37. Although the model described in this publication presumably can be
improved as data on weather and yields accumulate, it must be used with care.
The essence of the model is a set of equations which estimate a linear
relationship between the yield of a given crop and a number of weather variables and a
time trend. There are several problems inherent in the model:

• The time trend is used as a surrogate for factors that the model, because
of lack of data, cannot account for specifically.

• In any event, since the time trend averages the long-term effect of
other variables, it will not reflect their influence accurately in any
given year. The impact of a sharp increase in the supply of fertilizer
in a given year, for example, would not be reflected in the results of the
model.

• The weather indexes measure the influence of weather variables in
an "average" year, but each year is different with respect to the
timing of crop developments.

• The assumed linear influence of the weather variables undoubtedly
fails to capture the effects of extreme weather accurately.

38. The range of the expected prediction error can probably be narrowed
by introducing nonlinear variables, additional experiments with the grouping of
crop regions, or other weather variables. Also, the data base will gradually
increase, adding to the number of observations available to estimate the parameters
of the weather variable. Despite these improvements, it will still be necessary
to consider collateral information about variations in the crop calendar, harvesting
conditions, use of fertilizer, and amounts and quality of seed available and to
adjust the model predictions accordingly.
APPENDIX

DESCRIPTION OF THE DATA BASE 


Weather Data 

Four basic types of weather data covering the period from 1961 to the 
present are currently maintained in computer files for use in weather-yield
regressions — precipitation by decade 1 for the months of April through September, 
total precipitation by month, average monthly temperature, and soil moisture as 
of the last day of each month. 

Sources and Reliability 

Basic precipitation and temperature data for individual weather stations in 
the USSR are broadcast two to four times daily by Soviet radio stations. 2 From 
these broadcasts, weather data are interpolated for some 134 grid points spaced 
roughly 75 miles apart in a square pattern throughout the area studied. 
Periodically, total precipitation, mean temperature, and soil moisture are calcu- 
lated for each grid point and aggregated into values for crop districts. 
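
The two steps just described, interpolation from stations to grid points and
aggregation of grid points into a crop district, can be sketched as follows.
The report does not state which interpolation method is used, so
inverse-distance weighting and a simple mean over grid points are assumptions
made here purely for illustration; all names and values are hypothetical.

```python
import math


def idw_interpolate(grid_point, stations, power=2.0):
    """Inverse-distance-weighted estimate at grid_point.

    stations is a list of (x, y, value) tuples in the same planar units as
    grid_point. IDW is an assumed stand-in, not the report's own method.
    """
    gx, gy = grid_point
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in stations:
        distance = math.hypot(sx - gx, sy - gy)
        if distance == 0.0:
            return value  # a station sits exactly on the grid point
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


def district_value(grid_values):
    """Aggregate grid-point values to one district value (simple mean assumed)."""
    return sum(grid_values) / len(grid_values)


# Hypothetical stations: (x in miles, y in miles, precipitation in millimeters).
stations = [(0.0, 0.0, 10.0), (75.0, 0.0, 20.0),
            (0.0, 75.0, 30.0), (75.0, 75.0, 40.0)]
center = idw_interpolate((37.5, 37.5), stations)   # equidistant from all four
on_station = idw_interpolate((75.0, 0.0), stations)
district = district_value([center, on_station])
```

With only one station near a grid point the estimate simply repeats that
station's reading, which is one way to see why denser coverage yields better
grid values for a variable as patchy as precipitation.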

Reliability of the weather data is primarily a function of the density of 
coverage — that is, the number of individual weather stations from which reports 
emanate affects the interpolated data for a given grid point. The density of 
coverage is most critical for the precipitation values because there can be con- 
siderable variation in precipitation over a relatively small area, especially in 
hilly country. Temperature values, on the other hand, are considered to be 
more continuous. 

Before April 1965 the density of coverage averaged one reporting weather 
station per grid point, largely because of limitations imposed by the time- 
consuming manual method of interpolation used. Since that date, computer 
analysis has applied a complex method of interpolation that permitted increasing 
the density of coverage to four or five reporting weather stations per grid point. 
Consequently, the interpolated values for precipitation and temperature since 
April 1965 are considerably better indicators of weather conditions at the grid 
points than those for earlier years. 

Storage and Handling of Weather Data 

The basic weather data are received at approximately ten-day intervals 
and stored on a computer disk file. Six categories of weather variables are 
stored in the file in the following order: 

1. First decade precipitation 

2. Second decade precipitation 

3. Third decade precipitation 

4. Total monthly precipitation 

5. Average monthly temperature 

6. Soil moisture (last day of month) 

1 Each month is divided into three periods called "decades."

2 As a member of the World Meteorological Organization, the USSR shares such
information with foreign countries.
Each of the three decade categories contains only six variables (representing the
months of September and April through August) while the remaining three
categories contain 12 variables each (one for each month of the year). A complete
list of the weather variables is shown in Table A-1.


TABLE A-1

STANDARD WEATHER VARIABLES AVAILABLE

Number   Name of Variable

 1       Precipitation, 1st Decade of September
 2       Precipitation, 1st Decade of April
 3       Precipitation, 1st Decade of May
 4       Precipitation, 1st Decade of June
 5       Precipitation, 1st Decade of July
 6       Precipitation, 1st Decade of August
 7       Precipitation, 2d Decade of September
 8       Precipitation, 2d Decade of April
 9       Precipitation, 2d Decade of May
10       Precipitation, 2d Decade of June
11       Precipitation, 2d Decade of July
12       Precipitation, 2d Decade of August
13       Precipitation, 3d Decade of September
14       Precipitation, 3d Decade of April
15       Precipitation, 3d Decade of May
16       Precipitation, 3d Decade of June
17       Precipitation, 3d Decade of July
18       Precipitation, 3d Decade of August
19       Precipitation - September
20       Precipitation - October
21       Precipitation - November
22       Precipitation - December
23       Precipitation - January
24       Precipitation - February
25       Precipitation - March
26       Precipitation - April
27       Precipitation - May
28       Precipitation - June
29       Precipitation - July
30       Precipitation - August
31       Average Temperature (Absolute Value) - September
32       Average Temperature (Absolute Value) - October
33       Average Temperature (Absolute Value) - November
34       Average Temperature (Absolute Value) - December
35       Average Temperature (Absolute Value) - January
36       Average Temperature (Absolute Value) - February
37       Average Temperature (Absolute Value) - March
38       Average Temperature (Absolute Value) - April
39       Average Temperature (Absolute Value) - May
40       Average Temperature (Absolute Value) - June
41       Average Temperature (Absolute Value) - July
42       Average Temperature (Absolute Value) - August
43       Soil Moisture (Last Day of the Month) - September
44       Soil Moisture (Last Day of the Month) - October
45       Soil Moisture (Last Day of the Month) - November
46       Soil Moisture (Last Day of the Month) - December
47       Soil Moisture (Last Day of the Month) - January
48       Soil Moisture (Last Day of the Month) - February
49       Soil Moisture (Last Day of the Month) - March
50       Soil Moisture (Last Day of the Month) - April
51       Soil Moisture (Last Day of the Month) - May
52       Soil Moisture (Last Day of the Month) - June
53       Soil Moisture (Last Day of the Month) - July
54       Soil Moisture (Last Day of the Month) - August
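
The numbering in Table A-1 follows directly from the storage order of the six
categories. The sketch below shows one way to compute a variable number from a
category and a month; the category labels and the function name are
hypothetical and are not taken from the computer programs themselves.

```python
# Month order within each category, as listed in Table A-1.
DECADE_MONTHS = ["September", "April", "May", "June", "July", "August"]
YEAR_MONTHS = ["September", "October", "November", "December", "January",
               "February", "March", "April", "May", "June", "July", "August"]

# Categories in file order; the labels are illustrative, not from the report.
CATEGORIES = [
    ("decade1", DECADE_MONTHS),
    ("decade2", DECADE_MONTHS),
    ("decade3", DECADE_MONTHS),
    ("monthly_precipitation", YEAR_MONTHS),
    ("temperature", YEAR_MONTHS),
    ("soil_moisture", YEAR_MONTHS),
]


def variable_number(category, month):
    """Return the Table A-1 variable number (1 through 54)."""
    offset = 0
    for name, months in CATEGORIES:
        if name == category:
            # list.index raises ValueError for a month the category lacks.
            return offset + months.index(month) + 1
        offset += len(months)
    raise KeyError(category)


total_variables = sum(len(months) for _, months in CATEGORIES)
```

The three decade categories contribute 18 variables and the three monthly
categories 36, which gives the 54 variables listed in the table.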
The data bases for harvest and sown area have been assembled for five
grain crops: (1) all grain, (2) winter wheat, (3) spring wheat, (4) spring
barley, and (5) winter rye. These data bases are computerized, and each will be
referred to as a “file.” Each file has a common structure; all are processed by the
same computer program. The data sources are not the same, however, and some
data have been estimated.

Data Required 

For each crop, data are required for all 33 crop regions. The 33 crop regions 
are composed of 127 administrative divisions — 12 Soviet Socialist Republics 
(SSRs), 16 Autonomous Soviet Socialist Republics (ASSRs), 93 oblasts, and 
6 krays. For each of the 127 divisions, data on the size of the harvested crop in
thousands of tons and the harvested area in thousands of hectares are needed. 
The computer program computes yields for each of the 127 divisions in centners 
per hectare, sums the harvested crop and the harvested area for all of the divisions 
within a crop region, and computes the yield for each crop region. 
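
The yield arithmetic can be illustrated with a short sketch. It relies on the
fact that one centner equals 100 kilograms, so tons per hectare multiplied by
ten gives centners per hectare; the function names and the sample figures are
hypothetical.

```python
def yield_centners_per_hectare(harvest_thousand_tons, area_thousand_hectares):
    """Yield in centners per hectare (1 ton = 10 centners)."""
    return 10.0 * harvest_thousand_tons / area_thousand_hectares


def region_yield(divisions):
    """Crop-region yield from (harvest, area) pairs for its divisions.

    Harvests and areas are summed first, so the result is weighted by area
    rather than being a simple mean of the division yields.
    """
    total_harvest = sum(harvest for harvest, _ in divisions)
    total_area = sum(area for _, area in divisions)
    return yield_centners_per_hectare(total_harvest, total_area)


# Hypothetical divisions: (harvest in thousand tons, area in thousand hectares).
divisions = [(300.0, 200.0), (100.0, 100.0)]
division_yields = [yield_centners_per_hectare(h, a) for h, a in divisions]
crop_region_yield = region_yield(divisions)
```

In this example the division yields are 15.0 and 10.0 centners per hectare,
while the region yield is about 13.3 rather than the unweighted mean of 12.5.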

Data Sources 

All Grain: Data on production and sown area for all grain are usually avail- 
able from Soviet statistical handbooks published for the area and year of interest. 
Although data for Kazakhstan in 1962-64 and 1969-70 are missing, five-year 
moving averages for spring wheat in selected Kazakhstan oblasts were recently 
published. Since spring wheat is dominant in these areas, these data were used 
to estimate yields for 1962-64 and 1969-70 on the assumption that spring wheat
yields were the same as for all grain. 

No other all grain data are estimated except in the sense that the computer 
program computes the yields. Sometimes this causes a variation of 0.1 centner 
per hectare from the yields published in Soviet statistical handbooks. 

Spring and Winter Wheat: Data for Estonia, Latvia, Lithuania, Belorussia,
Moldavia, Kazakhstan, and the Ukraine are taken from various statistical abstracts.
The statistical handbooks for the RSFSR normally publish data for all wheat only.
In order to estimate data for winter and spring wheat separately, additional
sources and estimation methods had to be used. Several estimation procedures 
were used, depending on the data available. 

The best estimates are for the years 1960, 1962, and 1963. The only data 
missing are the harvested crop by administrative division, which can be estimated
by

(1)  Harvest = Area × Yield

for each division. To adjust for errors introduced by the use of rounded data, 
published data on spring and winter wheat by economic region were used. The 
actual harvested crop for an economic region should equal the sum of the calcu- 
lated harvested crops for the divisions within that region, and the published 
data on total wheat for each division should equal the sum of the calculated 
winter wheat harvest and the calculated spring wheat harvest. The calculated 
harvest data were adjusted to meet these conditions while remaining within the 
range of possible rounding error. 
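
A minimal sketch of this kind of adjustment follows. The report does not
describe the exact procedure, so the rounding precisions (area to the nearest
thousand hectares, yield to the nearest 0.1 centner per hectare) and the
proportional scaling to a regional control total are assumptions made for
illustration; all names and figures are hypothetical.

```python
def harvest_bounds(area, yield_c_per_ha, area_step=1.0, yield_step=0.1):
    """Range of harvests (thousand tons) consistent with rounded inputs.

    area is in thousand hectares and yield in centners per hectare; the
    division by 10 converts thousand centners to thousand tons.
    """
    low = (area - area_step / 2) * (yield_c_per_ha - yield_step / 2) / 10.0
    high = (area + area_step / 2) * (yield_c_per_ha + yield_step / 2) / 10.0
    return low, high


def adjust_to_control_total(divisions, control_total):
    """Scale division harvests to sum to the published regional total.

    Returns the adjusted harvests and, for each division, whether the
    adjusted value stays inside its rounding range.
    """
    points = [area * yld / 10.0 for area, yld in divisions]
    scale = control_total / sum(points)
    adjusted = [p * scale for p in points]
    within = []
    for harvest, (area, yld) in zip(adjusted, divisions):
        low, high = harvest_bounds(area, yld)
        within.append(low <= harvest <= high)
    return adjusted, within


# Hypothetical divisions: (area in thousand hectares, yield in centners/hectare).
divisions = [(250.0, 12.3), (120.0, 9.8)]
adjusted, within = adjust_to_control_total(divisions, control_total=426.0)
first_low, first_high = harvest_bounds(250.0, 12.3)
```

A `False` entry in `within` would indicate that rounding alone cannot explain
the gap between the calculated harvests and the published total.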

The estimates for 1958 and 1971 use essentially the same method. For 1958, 
exact data on the harvested crop by economic region are not available. It was 
estimated using data on area and yields by economic region. Hence the
adjustments to the calculated harvest are not as accurate. For 1971, data on the
harvested area of winter and spring wheat are not available by economic region or
by administrative division. Data on the harvested crop are available; hence
equation (1) can be rearranged as

(2)  Area = Harvest ÷ Yield

in order to estimate sown areas. The estimated sown areas were then refined using
the same type of check as described above.


For 1965-69, data for spring or winter wheat have not been reported by 
administrative division, for the harvested crop or the harvested area. Equations 
(3) and (4) were solved simultaneously to estimate these data for each ad- 
ministrative division. 


(3)  $H = \mathit{YS} \times \mathit{AS} + \mathit{YW} \times \mathit{AW}$

(4)  $A = \mathit{AS} + \mathit{AW}$

where:

H = total wheat harvest
YS = yield of spring wheat
YW = yield of winter wheat
AS = sown area of spring wheat
AW = sown area of winter wheat
A = total sown area of wheat

H, A, YS, and YW are known. The solution is:

(5)  $\mathit{AW} = \dfrac{H - A \times \mathit{YS}}{\mathit{YW} - \mathit{YS}}$

(6)  $\mathit{AS} = A - \mathit{AW}$

Because the denominator of equation (5) is the difference of two numbers of
similar magnitudes, these estimates are susceptible to large errors. The
refinement procedures discussed above were also used for these calculations, but
had a much greater impact because larger errors were encountered.
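
Equations (5) and (6), and the sensitivity noted above, can be illustrated
with a short sketch. The figures are hypothetical and are expressed in
consistent units so that equation (3) holds exactly; the function name is
likewise illustrative.

```python
def split_wheat_area(h, a, ys, yw):
    """Solve equations (3) and (4) for the spring and winter wheat areas.

    h is the total harvest, a the total area, ys and yw the spring and
    winter yields, all in consistent units. Returns (AS, AW).
    """
    if yw == ys:
        raise ValueError("equal yields: equation (5) has a zero denominator")
    aw = (h - a * ys) / (yw - ys)   # equation (5)
    as_ = a - aw                    # equation (6)
    return as_, aw


# Hypothetical true areas: 60 units of spring wheat and 40 of winter wheat.
true_as, true_aw, ys = 60.0, 40.0, 10.0

# Case 1: yields far apart (10 and 20). A 1% error in H moves AW only slightly.
h_wide = ys * true_as + 20.0 * true_aw
exact = split_wheat_area(h_wide, 100.0, ys, 20.0)
_, aw_wide = split_wheat_area(1.01 * h_wide, 100.0, ys, 20.0)

# Case 2: yields close together (10 and 10.5). The same 1% error is amplified.
h_close = ys * true_as + 10.5 * true_aw
_, aw_close = split_wheat_area(1.01 * h_close, 100.0, ys, 10.5)
```

Here a 1 percent overstatement of the harvest shifts the winter wheat area
from 40 to about 41.4 when the yields differ by 10, but to about 60.4 when
they differ by only 0.5.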


In order to gauge the accuracy of this method, equations (5) and (6) were 
computed for all years 1958-69. Direct comparisons with published data for 1958,
1960, 1962, and 1963 showed that as long as the difference between spring and
winter wheat yields (the denominator of (5)) was not close to zero, the estimates
were quite good. The refinements using control totals for economic regions and
the relative stability over time of sown areas provide additional confidence in the 
estimates. 


The situation for 1961 and 1964 is similar to that for 1965-69, except that con- 
trol totals for harvested area by economic region are not available. But the har- 
vested crop and yield together imply an estimated harvested area for economic 
regions that is sufficiently accurate to provide usable refinements of estimates
for administrative divisions. For 1959, there are data by economic region for 
harvested area but not for harvested crops. As for 1961 and 1964, crops can be 
estimated using data on yields and harvested area by economic region. 

Data for 1970 were the hardest to estimate. Data for winter and spring wheat 
have not been reported separately by oblast, only for total wheat. But the source
of data for 1971 also gives 1971 results as a percent of the 1966-70 average. From 



Approved For Release 2000/09/14 : CIA-RDP86T00608R000500230008-5 



this information, the average yields and average harvested crops could be
calculated for 1966-70. The estimates for 1970 are then given by

(7)  $Y_{70} = 5\bar{Y} - \sum_{i=1966}^{1969} Y_i$

     $A_{70} = 5\bar{A} - \sum_{i=1966}^{1969} A_i$

where:

$Y_i$ = yield in year i
$A_i$ = sown area in year i
$\bar{Y}$ = average yield in 1966-70
$\bar{A}$ = average sown area in 1966-70

These estimates were then refined by checking them against economic region
data and against data for total wheat by administrative division. In most cases
the estimates did not have to be substantially revised.
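
The 1970 calculation can be sketched in two steps: recovering the 1966-70
average from the 1971 figure and its published percentage, then subtracting
the four known years from five times that average. The function names and all
figures below are hypothetical.

```python
def average_from_percent(value_1971, percent_of_average):
    """Recover the 1966-70 average from the 1971 value and its percentage."""
    return value_1971 / (percent_of_average / 100.0)


def final_year_value(five_year_average, earlier_years):
    """Equation (7): the fifth year implied by a five-year average."""
    if len(earlier_years) != 4:
        raise ValueError("expected the four values for 1966 through 1969")
    return 5.0 * five_year_average - sum(earlier_years)


# Hypothetical yields in centners per hectare for 1966 through 1969.
yields_1966_69 = [12.0, 10.5, 11.5, 11.0]
# Hypothetical source entry: 1971 yield of 13.68, reported as 120% of average.
average_1966_70 = average_from_percent(13.68, 120.0)
yield_1970 = final_year_value(average_1966_70, yields_1966_69)
```

Any error in the average is multiplied by five in the result, which is one
reason the subsequent check against economic region data matters.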

Spring Barley: Data on spring barley are not published consistently. The 
statistical handbooks for the Ukraine regularly publish data by oblast, but other 
handbooks normally do not. The data for the Baltic districts, Belorussia, and
Moldavia come mostly from the agricultural handbook Sel'skoye khozyaystvo
(1970) and are directly available as yields in centners per hectare. Data for 
Kazakhstan are not available for the years 1962-64 and 1969-71. 

Data for the RSFSR are available at the administrative division level by
yield only for the years 1958-69. Weights were estimated in order to combine 
the yields of the divisions within a crop region. They are based on data on 
sown area derived from a Soviet journal article and from a Soviet agricultural 
atlas. 1 The only data available for 1970 and 1971 are for economic regions. They 
are usable for two regions, the Central and the Volga-Vyatsk, which are very 
close in definition to crop districts 13 and 14. The weighted averages for each 
crop district were computed and the results entered in the place of one of the 
divisions in each crop district. 
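
Where only yields are available, combining them with estimated weights amounts
to a weighted average. The sketch below assumes the weights are sown-area
estimates of the kind described above; the function name and figures are
hypothetical.

```python
def weighted_region_yield(division_yields, area_weights):
    """Weighted average of division yields using estimated sown-area weights."""
    if len(division_yields) != len(area_weights):
        raise ValueError("one weight is needed for each division yield")
    total_weight = sum(area_weights)
    weighted = sum(y * w for y, w in zip(division_yields, area_weights))
    return weighted / total_weight


# Hypothetical spring barley yields (centners per hectare) and area weights.
barley_yields = [12.0, 16.0, 10.0]
area_weights = [2.0, 1.0, 1.0]
region_barley_yield = weighted_region_yield(barley_yields, area_weights)
```

Only the relative sizes of the weights matter, so sown areas read from an
atlas need not be exact as long as their proportions are reasonable.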

Winter Rye: Data by oblast for the RSFSR, which has 75% of the area sown
to winter rye, have been published only through 1969. Oblast data for the Ukraine
(less than 10% of the total area sown to winter rye) are available only since 1965.
Reasonable estimates for 1958-64 can be made at the crop region level, however, 
from the data for economic regions which are published annually. Various 
republic handbooks and the USSR handbook supply the data for the remaining 
crop regions. 


1 Zernovoye khozyaystvo, No. 3, 1972, p. 17-20 and Atlas sel'skogo khozyaystva SSSR,
Moscow, 1960.
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